Fix fetch cache key collisions for Request and FormData bodies #332
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 13731560dc
/bigbonk review
Review Summary
The problem this PR addresses is real and the fix direction is correct: Request object bodies were silently ignored in cache key generation, and the old FormData serialization was ambiguous. Good test coverage for the new cases.
I found a few issues — one correctness concern, one memory safety gap, and two housekeeping items.
Issues Found
1. (Bug) Confusing double-clone and silent fallthrough for FormData parsing failure — see inline comments on lines 213 and 224.
2. (Memory) Full body buffered before size check for Request bodies — same as Codex flagged. See inline comment on line 224.
3. (Housekeeping) Cache key version not bumped
The FormData serialization format changed from key=val1,val2 to JSON.stringify([key, [{kind,value},...]]). The CACHE_KEY_PREFIX on line 40 should be bumped to "v3" to make the format boundary explicit and prevent ambiguity during rollbacks or gradual deployments. Not strictly required for correctness (old entries won't match the new format), but it's good hygiene.
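As a minimal sketch of that hygiene step (the constant name comes from the review; the key shape below is an assumption for illustration, not the repo's actual format):

```typescript
// Assumed key shape, for illustration only; the point is the prefix bump.
// Moving "v2" -> "v3" guarantees entries written under the old FormData
// serialization format can never match keys produced by the new one.
const CACHE_KEY_PREFIX = "v3";

function buildKey(method: string, url: string, bodyFragment: string): string {
  return `${CACHE_KEY_PREFIX}:${method}:${url}:${bodyFragment}`;
}

console.log(buildKey("POST", "https://example.com/api", "body-hash"));
// "v3:POST:https://example.com/api:body-hash"
```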
4. (Minor) FormData.keys() iteration order
The extracted serializeFormData helper (line 95) iterates FormData keys via new Set(formData.keys()). Iteration order is insertion-order in all major engines but not spec-mandated. Two FormData objects with the same entries appended in different orders will produce different cache keys. This is arguably correct (different wire order = different payload), just worth being aware of. The new Set() dedup combined with getAll() is the right approach.
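A hedged sketch of that dedup-plus-getAll pattern (`serializeFormDataKey` is a hypothetical name, not the PR's actual helper; the entry shape is assumed from the review's description):

```typescript
// Hypothetical sketch: dedupe keys with new Set(), collect all values per
// key via getAll(), and serialize as structured JSON instead of comma-joining.
function serializeFormDataKey(formData: FormData): string {
  const parts: string[] = [];
  for (const key of new Set(formData.keys())) {
    const entries = formData.getAll(key).map((value) =>
      typeof value === "string"
        ? { kind: "string", value }
        : { kind: "file", name: value.name, type: value.type },
    );
    parts.push(JSON.stringify([key, entries]));
  }
  return parts.join("|");
}

// The comma-collision case: ("x,y") vs ("x", "y") now serialize differently.
const joined = new FormData();
joined.append("tags", "x,y");
const split = new FormData();
split.append("tags", "x");
split.append("tags", "y");
console.log(serializeFormDataKey(joined) !== serializeFormDataKey(split)); // true
```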
5. (Test gap) Already-consumed Request body
There's no test for the case where a Request body has already been consumed (locked/disturbed stream). Currently input.clone() inside serializeBody would throw. It might be worth adding a test documenting this behavior — either it throws and the caller skips caching, or it should be handled gracefully.
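If such a test were added, the behavior it would pin down looks roughly like this sketch (it relies on the Fetch spec's rule that cloning a request with a disturbed body throws; it is not taken from the PR's suite):

```typescript
// Per the Fetch spec, clone() throws a TypeError once the body has been
// consumed (the stream is disturbed), which is the case serializeBody
// would hit on an already-used Request.
async function cloneAfterConsumeThrows(): Promise<boolean> {
  const req = new Request("https://example.com/", { method: "POST", body: "x" });
  await req.text(); // consume (disturb) the body stream
  try {
    req.clone();
    return false;
  } catch {
    return true; // TypeError: cannot clone a used request
  }
}

cloneAfterConsumeThrows().then((threw) => console.log(threw)); // true
```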
```ts
const call = fetchMock.mock.calls[0];
const forwardedRequest = call[0] as Request;
expect(forwardedRequest).toBeInstanceOf(Request);
expect(await forwardedRequest.text()).toBe("request-body-content");
```
Good test. One edge case worth considering for future coverage: what happens when the Request body has already been consumed (i.e., input.body is a locked/disturbed stream)? Currently input.clone() inside serializeBody would throw. A test documenting whether that throws or gracefully skips caching would be helpful.
Review posted successfully. Here's a summary of the feedback:

Key findings:
…ntent-length check + regression test
/bigbonk re-review please
Review Summary
This PR fixes two real correctness bugs in fetch cache key generation:
- `Request` object bodies were silently ignored in cache key computation
- `FormData` comma-joined serialization was ambiguous/non-injective
The approach is sound — the fix stays within the existing architecture and only changes key generation. Good test coverage (16 new tests covering the matrix of Request bodies, FormData collisions, multipart boundary normalization, malformed form bodies, and body passthrough). All 71 tests pass.
I see the previous review from ask-bonk already flagged several items, and I can confirm that some have been addressed in subsequent commits (cache key version bump to v3, Content-Length pre-check, SkipCacheKeyGenerationError for malformed bodies). A few remaining observations below.
Overall this is solid work. The issues below are minor-to-medium.
```ts
const boundedRequest = new Request(input.url, {
  method: input.method,
  headers: contentType ? { "content-type": contentType } : undefined,
  body: new Blob(chunks.map((chunk) => chunk.slice())),
```
Unnecessary .slice() on chunks.
The chunks array was just read from a cloned request's stream in readRequestBodyChunksWithinLimit. These Uint8Array buffers are already owned by this function — nobody else holds a reference to them. The .slice() here creates a defensive copy of each chunk, but it's not needed since the buffers were freshly read from requestClone.body (which itself was cloned from input). The original input request body is untouched.
This is a minor allocation concern, not a bug.
Suggested change:

```diff
- body: new Blob(chunks.map((chunk) => chunk.slice())),
+ body: new Blob(chunks),
```
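A small standalone sketch of the allocation being flagged (illustrative, not the PR's code):

```typescript
// slice() copies the chunk into a fresh backing buffer. When the chunks
// are exclusively owned, the Blob can take them without that extra copy.
const chunk = new Uint8Array([1, 2, 3]);
const copy = chunk.slice();

console.log(copy.buffer === chunk.buffer); // false: a second allocation
console.log(new Blob([chunk]).size); // 3: the original chunk works as-is
```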
```ts
pushBodyChunk(JSON.stringify([key, { kind: "string", value: val }]));
continue;
}
if (val.size > MAX_CACHE_KEY_BODY_BYTES || getTotalBodyBytes() + val.size > MAX_CACHE_KEY_BODY_BYTES) {
```
File size check doesn't account for JSON serialization overhead.
The size check uses val.size (raw file bytes), but pushBodyChunk then calls JSON.stringify(...) which adds the key name, {kind: "file", name: ..., type: ..., value: ...} wrapper, and JSON escaping overhead. For a file right at the size limit, the serialized JSON string will be larger than val.size, potentially exceeding the limit without being caught by this pre-check.
In practice this is unlikely to matter (the pushBodyChunk function itself also checks totalBodyBytes against the limit), but the pre-check here is misleadingly precise. Not blocking — just noting the discrepancy.
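A quick sketch of the discrepancy (the JSON wrapper shape follows the snippet above; the field name and sizes are illustrative):

```typescript
// The pre-check compares raw value bytes against the limit, but the data
// actually pushed is a JSON string carrying the key name, wrapper fields,
// and escaping, so it is always larger than the raw value it wraps.
const raw = "x".repeat(1000); // stand-in for a file's raw contents
const pushed = JSON.stringify([
  "upload", // hypothetical field name
  { kind: "file", name: "a.txt", type: "text/plain", value: raw },
]);

console.log(pushed.length > raw.length); // true: serialization adds overhead
```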
```ts
}
}

for (const chunk of chunks) {
```
Non-form Request bodies don't go through pushBodyChunk for size checking.
When a Request body has a non-form content type, the chunks are decoded and pushed directly into bodyChunks (line 315) without going through pushBodyChunk, which means totalBodyBytes is not updated and the per-chunk size limit in pushBodyChunk is never hit. Only the final trailing chunk on line 319 goes through pushBodyChunk.
This is partially mitigated by the streaming size check in readRequestBodyChunksWithinLimit (which enforces the 1 MiB limit during reading), so this won't cause an actual over-limit key. But it means totalBodyBytes is inaccurate for the rest of the function, which could matter if future code depends on it.
Suggested change:

```ts
for (const chunk of chunks) {
  pushBodyChunk(decoder.decode(chunk, { stream: true }));
}
```
```ts
});
const formData = await boundedRequest.formData();
await serializeFormData(formData, pushBodyChunk, getTotalBodyBytes);
canonicalizedContentType = formContentType === "multipart/form-data" && contentType
```
Boundary stripping only applies to multipart/form-data, not application/x-www-form-urlencoded.
This is correct behavior (urlencoded doesn't have boundaries), but the conditional reads a bit awkwardly. Consider adding a brief comment explaining the asymmetry, since a reader might wonder why only multipart gets canonicalizedContentType treatment.
Also: when formContentType is "application/x-www-form-urlencoded", canonicalizedContentType stays undefined, so the original content-type header (including any charset=... params) remains in the cache key via collectHeaders. The test on line 525 ("urlencoded Request bodies with different charset headers get separate cache entries") confirms this works correctly. Good.
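The suggested comment could explain the asymmetry along these lines (`stripBoundary` is a hypothetical reconstruction; the PR's actual canonicalization code is not shown here):

```typescript
// Boundary stripping only makes sense for multipart/form-data: the boundary
// is a random per-request token, so it must not leak into the cache key.
// application/x-www-form-urlencoded has no boundary parameter; its params
// (e.g. charset) are meaningful and stay in the key via the headers.
function stripBoundary(contentType: string): string {
  return contentType
    .split(";")
    .filter((part) => !part.trim().toLowerCase().startsWith("boundary="))
    .join(";")
    .trim();
}

console.log(stripBoundary("multipart/form-data; boundary=----WebKitFormBoundary7MA4Y"));
// "multipart/form-data"
```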
```ts
const call = fetchMock.mock.calls[0];
const forwardedRequest = call[0] as Request;
expect(forwardedRequest).toBeInstanceOf(Request);
expect(await forwardedRequest.text()).toBe("request-body-content");
```
Good test, but it's testing body passthrough for cache misses only.
This verifies that input (the Request) still has a consumable body after cache key generation on a cache miss. Worth noting: on a cache hit, the input Request is never forwarded to originalFetch, so the body state doesn't matter. And on the stale-while-revalidate path (line 558 in fetch-cache.ts), input is forwarded to originalFetch for the background refetch — so the body also needs to be intact there. This test implicitly covers that since it's the same input reference, but an explicit test for the SWR path with a Request body would be stronger.
Not blocking — the current coverage is adequate for the fix.
```ts
expect(fetchMock).toHaveBeenCalledTimes(1);
});

it("malformed multipart Request bodies bypass cache instead of hashing raw bytes", async () => {
```
Good test for malformed multipart fallback behavior.
This confirms that malformed multipart bodies (where the boundary in the header doesn't match the body) bypass cache entirely (via SkipCacheKeyGenerationError) rather than falling through to raw-byte keying. This directly addresses the concern from the previous review about unstable cache keys from boundary-dependent raw bytes.
Review submitted successfully. Here's a summary of my feedback on PR #332:

Overall assessment: The PR fixes two real correctness bugs and the approach is sound. All 71 tests pass. The previous review's main concerns (cache key version bump, Content-Length pre-check, malformed body fallthrough) have been addressed in subsequent commits. Remaining observations:

/bigbonk re-review this pr please
Review Summary
This PR fixes two real correctness bugs in fetch cache key generation:
- `Request` object bodies were silently ignored, causing distinct POST payloads to share cache entries
- `FormData` comma-joined serialization was ambiguous/non-injective
The approach is well-structured: body extraction uses request.clone() to preserve the original for forwarding, Content-Length pre-check avoids unnecessary cloning for oversized bodies, malformed multipart bodies correctly skip caching via SkipCacheKeyGenerationError instead of falling through to raw-byte keying, and the cache key version was bumped to v3.
The test coverage is thorough — 16 new tests covering Request bodies (distinct, identical, FormData with comma collisions, multipart boundary normalization, malformed multipart, charset handling), init FormData improvements (comma collisions, entry order, file metadata), body passthrough verification, already-consumed body handling, oversized Content-Length fast-path, and SWR body preservation.
All 73 tests pass. The previous two review rounds' major concerns have been addressed. A few minor observations below — none are blocking.
```ts
const boundedRequest = new Request(input.url, {
  method: input.method,
  headers: contentType ? { "content-type": contentType } : undefined,
  body: new Blob(chunks as unknown as BlobPart[]),
```
Nit: the as unknown as BlobPart[] cast is unnecessary. Uint8Array[] is already assignable to BlobPart[] since Uint8Array extends ArrayBufferView which is a valid BlobPart. This compiles fine without the cast:
Suggested change:

```diff
- body: new Blob(chunks as unknown as BlobPart[]),
+ body: new Blob(chunks),
```
```ts
for (const chunk of chunks) {
  pushBodyChunk(decoder.decode(chunk, { stream: true }));
```
Minor: when the Request body has a non-form content type, chunks are pushed directly via pushBodyChunk here (good — the prior review's concern about bypassing pushBodyChunk appears to have been addressed). However, totalBodyBytes is incremented inside pushBodyChunk based on encoder.encode(chunk).byteLength, which re-encodes the already-decoded string. Since readRequestBodyChunksWithinLimit already enforced the 1 MiB limit on the raw bytes, this double-encoding is safe but slightly wasteful for large bodies. Not a real issue — just noting for awareness.
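The double work described above, sketched in isolation (illustrative, not the PR's code):

```typescript
// The raw chunk's byte length is known up front, but counting bytes via
// encoder.encode(decodedChunk).byteLength re-encodes the decoded string.
const rawChunk = new TextEncoder().encode("hello, cache key");
const decoded = new TextDecoder().decode(rawChunk); // needed for the key text
const recounted = new TextEncoder().encode(decoded).byteLength; // redundant pass

console.log(recounted === rawChunk.byteLength); // true: same answer, twice the work
```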
```ts
pushBodyChunk(JSON.stringify([key, { kind: "string", value: val }]));
continue;
}
if (val.size > MAX_CACHE_KEY_BODY_BYTES || getTotalBodyBytes() + val.size > MAX_CACHE_KEY_BODY_BYTES) {
```
Minor: the val.size pre-check here measures raw file bytes, but the actual data pushed to pushBodyChunk is a JSON string that includes the key name, {kind, name, type, value} wrapper, and JSON escaping. For files near the limit, the serialized JSON will be somewhat larger than val.size. This is effectively a non-issue because pushBodyChunk also enforces the limit on the serialized output, so worst case a slightly-over-limit file gets caught one call later. Just noting the asymmetry.
```ts
const formContentType = getParsedFormContentType(contentType);

if (formContentType) {
  try {
    const boundedRequest = new Request(input.url, {
      method: input.method,
      headers: contentType ? { "content-type": contentType } : undefined,
      body: new Blob(chunks as unknown as BlobPart[]),
    });
    const formData = await boundedRequest.formData();
    await serializeFormData(formData, pushBodyChunk, getTotalBodyBytes);
    canonicalizedContentType = formContentType === "multipart/form-data" && contentType
```
Worth a brief comment explaining the asymmetry: canonicalizedContentType is only set for multipart/form-data (to strip the boundary) but not for application/x-www-form-urlencoded (which has no boundary to strip). The charset and other params for urlencoded are preserved in the cache key via collectHeaders. The test at line 567 confirms this is correct, but a reader unfamiliar with the code might wonder why only multipart gets special treatment.
Non-blocking — just a readability suggestion.
Review submitted. Here's a summary of my assessment:

Verdict: Approved

This PR fixes two real correctness bugs:

The implementation is solid:

The previous two review rounds' major concerns (redundant clone, silent fallthrough, memory buffering, cache version bump) have all been addressed in subsequent commits. I left 4 minor non-blocking comments: an unnecessary type cast, a note on double-encoding in the non-form Request path, a note on the file size pre-check asymmetry, and a readability suggestion for the multipart-only canonicalization logic.
Summary
Fix fetch cache key generation so cached requests are keyed by the actual effective request body, including when the body is provided on a `Request` object. This also fixes ambiguous `FormData` serialization that could cause distinct payloads to collapse into the same cache entry.

Problem
Vinext’s fetch cache key generation did not fully account for the request body in all supported call shapes.
`Request` bodies were ignored

The cache key logic correctly merged headers from both `input` and `init`, but it only serialized `init.body`. If the body lived on a `Request` object, vinext effectively treated the request as body-less for cache-key purposes.

Example:
Before this change, both requests could produce the same cache key even though the payloads were different.
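An illustrative sketch of the call shape (the URL and payloads are invented for illustration, not taken from the PR):

```typescript
// Both requests carry their payload on the Request object, with no
// init.body. The old key builder only serialized init.body, so these
// could be keyed identically and share one cached response.
const searchAlpha = new Request("https://api.example.com/search", {
  method: "POST",
  headers: { "content-type": "application/json" },
  body: JSON.stringify({ q: "alpha" }),
});
const searchBeta = new Request("https://api.example.com/search", {
  method: "POST",
  headers: { "content-type": "application/json" },
  body: JSON.stringify({ q: "beta" }),
});

Promise.all([searchAlpha.clone().text(), searchBeta.clone().text()]).then(
  ([a, b]) => console.log(a !== b), // true: different payloads on the wire
);
```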
That makes cached POST-style fetches unsafe: a response generated for one payload can be reused for another payload.
Concrete failure modes include:
Multi-value `FormData` could collide

`FormData` values were serialized by joining them with commas, which is ambiguous.

Example:
These are different payloads, but they produced the same cache-key fragment.
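A sketch of the old comma-joined scheme and its collision (`oldJoin` is a hypothetical reconstruction of the pre-fix behavior, not the repo's code):

```typescript
// The pre-fix scheme joined multi-value entries with commas, which cannot
// distinguish one value containing a comma from two separate values.
function oldJoin(fd: FormData): string {
  return [...new Set(fd.keys())]
    .map((key) => `${key}=${fd.getAll(key).join(",")}`)
    .join("&");
}

const oneValue = new FormData();
oneValue.append("tags", "a,b");
const twoValues = new FormData();
twoValues.append("tags", "a");
twoValues.append("tags", "b");

console.log(oldJoin(oneValue) === oldJoin(twoValues)); // true: a collision
```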
Root Cause
The issue was in the fetch cache key builder, not in cache storage itself.
- `collectHeaders()` already handled `Request` inputs correctly
- `buildFetchCacheKey()` used `serializeBody(init)`
- `serializeBody()` only looked at `init.body`
- `Request` bodies were therefore omitted unless duplicated in `init`
- `FormData` entries were flattened with comma-joining, which is not injective
What Changed
Fetch cache keying
- `Request` object bodies in cache key generation
- `Request` inputs without mutating the original fetch behavior
- `FormData` serialization

Tests
Add regression coverage for:
- `Request` bodies producing distinct cache entries
- `Request` bodies reusing the same cache entry
- `FormData` payloads not colliding
- `Request` bodies still being forwarded intact after cache-key generation

Examples
Example 1: POST search requests
Before:
These could collide and reuse the wrong cached response.
After:
Example 2: Multi-value form submissions
Before:
These serialized to the same cache-key fragment.
After:
`FormData` values are serialized in a structured format

Why This Approach
This change fixes the root cause while keeping the existing caching model intact.
The principle is simple:
The patch stays within the current fetch cache architecture and only changes key generation and regression coverage.
Files Changed
- packages/vinext/src/shims/fetch-cache.ts
- tests/fetch-cache.test.ts
Ran targeted regression coverage for the affected surface: